I. Introduction


The intent of this study is two-fold:

   i. to determine quantitatively the nature and amount
      of additional information presented by a stereo
      ( as opposed to monoscopic ) visual apparatus;
  ii. to investigate qualitatively some useful ways
      of incorporating this additional information
      in an artificial visual scene analyser.

It may be noted at once that the only real distinction
between stereoscopic and monocular vision is that the latter
presents a single visual image of a scene, while the former
provides us with two images. This distinction quickly becomes
meaningless, however, unless a practical method exists of
comparing the two images, and determining the differences
between them. For this reason, I am forced right at the
beginning to address the question of "difference-measuring"
between visual images, and to state explicitly the assumptions
I have made concerning it.

My first assumption is that at some level of even current
vision programs, the image seen by a single eye is represented
as a 2-D matrix of measured light intensity values, or could
be so represented without much difficulty.

My second assumption is that if a stereo eye system were
to be used, it would be mechanically* constrained so that
the "center points" of the 2-D image matrices were never
representative of different points in 3-space; i.e., that
a "point-of-trigonometric-focus" existed, towards which both
eyes always "pointed". This focal point could freely shift
in distance away from the eyes, or closer, but the forward
axes of the eyes could not become significantly skew relative
to the limits of angular resolution. The eyes would be
capable of only single-degree-of-freedom motion with
respect to each other, about their vertical axes.

* A variety of feedback control systems, or even digital
control systems, can be imagined which might do this, and
yet allow the constraint to be removed if desired.

My third assumption, which is difficult to justify just
yet, is that if an element in one eye's image matrix were
selected, its counterpart in the other image could be found
from local evidence such that both represented the same point
in 3-space. This is rather a difficult exercise in pattern
matching in the general case, particularly since I understand
that high noise levels are present in the visual images,
but I will offer some results in Part III that can help
quite a bit in limiting the search. I'll come back to
this problem later; for now I'll just assume it is solvable.*
With these assumptions, we proceed to some mathematics
relevant to the ( continuous ) real-world situation.

* Lerman (1) has in fact presented results which demonstrate
that this type of pattern matching can be accomplished when
applied to images generated by eyes focused on infinity, the
only case he considered. See Part III.


II. Definitions, Co-ordinate Systems, and Consequences


Let us consider a fixed orthogonal reference co-ordinate
system S ( the 'table' system ), defined so that $\hat{i}$ is generally
'up', $\hat{j}$ is generally 'right', and $\hat{k}$ is generally 'away'.
Presume that in this system, the point midway between the
eyes is located out in the general $-\hat{k}$ direction at $P_s$, and
that the eyes are focused ( in the trigonometric sense ) at $F_s$.
A group of objects to be viewed lies near the origin, and
both $P_s$ and $F_s$ are known.

The S system all by itself is adequate for representing
the location of points in space, but it will be useful here
to define a few more for clarity. One alternative is the
system $J_o$ ( see Fig. I ), whose origin lies at $P_s$, and whose
orientation is such that

   $\hat{i}_o$ lies along $\hat{j}_o \times \hat{k}_o$     ( almost 'up' );
   $\hat{j}_o$ lies along $\hat{k}_o \times \hat{i}$       ( horizontal; almost 'right' );
   $\hat{k}_o$ lies along $F_s - P_s$                      ( toward the point of focus ).

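Transformation between these systems is easily made through
the relation

$$X_s = T_{so}\,X_o + P_s \tag{2-1}$$

where $T_{so}$ is the matrix whose columns are

$$\hat{i}_o = \mathrm{UNIT}\big( ((F_s - P_s) \times (1,0,0)) \times (F_s - P_s) \big)$$
$$\hat{j}_o = \mathrm{UNIT}\big( (F_s - P_s) \times (1,0,0) \big)$$
$$\hat{k}_o = \mathrm{UNIT}\big( F_s - P_s \big) \tag{2-2}$$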
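The construction of the facial axes and the change of co-ordinates can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names (`cross`, `unit`, `facial_axes`, `facial_to_table`) are hypothetical, vectors are plain 3-tuples ordered ( up, right, away ), and the facial axes are taken to be the columns of the transformation matrix, which is what a relation of the form "table position = matrix times facial position plus eye midpoint" requires.

```python
import math

def cross(a, b):
    """Right-handed cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def unit(v):
    """UNIT(v): v scaled to length one."""
    n = math.sqrt(sum(c * c for c in v))
    return tuple(c / n for c in v)

def facial_axes(p_s, f_s):
    """Axes of the facial system expressed in S.

    Assumption: the line of sight F_s - P_s is not parallel to 'up';
    in that degenerate case unit() divides by zero.
    """
    i_s = (1.0, 0.0, 0.0)                                # 'up' axis of S
    k_o = unit(tuple(f - p for f, p in zip(f_s, p_s)))   # toward the point of focus
    j_o = unit(cross(k_o, i_s))                          # horizontal, almost 'right'
    i_o = cross(j_o, k_o)                                # almost 'up' (already unit length)
    return i_o, j_o, k_o

def facial_to_table(x_o, p_s, f_s):
    """Map a facial-system point to the table system: T_so applied to x_o, plus P_s."""
    i_o, j_o, k_o = facial_axes(p_s, f_s)
    return tuple(x_o[0] * i_o[n] + x_o[1] * j_o[n] + x_o[2] * k_o[n] + p_s[n]
                 for n in range(3))
```

As a quick check, a facial-system point lying straight ahead at the focal distance should map back onto the focal point itself, whatever the position of the head.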



[Figure I: the system S ( table co-ordinates ) and the system $J_o$ ( facial co-ordinates ).]


It will be assumed that our eyes in the $J_o$ system lie
at $\pm D$, where $D = d\,( 0,1,0 )$, and it will be noted that $F_s$
in this system appears to lie at $F_o = f\,( 0,0,1 )$, where
$f = |F_s - P_s|$. Thus, $J_o$ in humans seems to be some sort
of "facial" system, with $\hat{k}$ out the nose, and $\hat{i}$ out the top
of the forehead. Here we have made certain, though, that
the eyes always focus on points only along the k-axis*, or
"straight-ahead".

We will be interested in finding from measured quantities
the location of some point $X_s$ in S, and it will simplify
matters if we let $E_s = X_s - F_s$, and then find that vector
instead. We note quickly that

$$E_s = X_s - F_s = d\,T_{so}\,\epsilon \tag{2-3}$$

where

$$\epsilon = \frac{1}{d}\,( X_o - F_o ) \tag{2-4}$$

and proceed to the problem of finding $\epsilon$ ( non-dimensional
displacement from F in the 'facial' system ) in terms of
measurable quantities.

* It is possible to generalize and allow the eyes to focus
on points other than straight-ahead, but the algebra becomes
quite a bit more complicated. Since the "head" can be
moved, this doesn't seem a serious restriction.

The quantities we will measure come from a comparison
of the two images recorded by the eyes. I assume that these
images are formed by the projection of distant points onto
planes perpendicular to rays between the eyes themselves and
the focal point, F. Defining two new systems, $J_l$ and $J_r$
( see Fig. II ), with origins located in $J_o$ at $-D$ and $+D$
respectively, and oriented so that their k-axes point toward
F and their i-axes remain aligned with $\hat{i}_o$, we can find the
locations of any point $X_o$ ( in $J_o$ ) in the new systems as

$$X_{l,r} = T_{(l,r)o}\,( X_o \pm D ) = T_{(l,r)o}\,( d\,\epsilon + F_o \pm D ) \tag{2-5}$$

where ( the upper signs applying to the left eye, etc. )

If we define planes normal to $\hat{k}$ at ( 0,0,cd ) in both $J_l$
and $J_r$, and then project a point onto them, the intersections
will occur at

in the 'left' system and plane, and at

( 2-6b )

in the 'right' system and plane.

c is merely a scale factor determined by the optics alone.
The key quantities, which define the location of the projections
in the image planes of Part I, are, from (2-4) and (2-6),

[equations illegible in source]

and

where again, and in what follows, the upper signs apply to
the 'left' system.
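The projection geometry described above can be sketched numerically. What follows is a minimal illustration, not the paper's equations: it assumes the left eye sits at -D and the right eye at +D in the facial system, that each eye is turned about its vertical axis so that its k-axis passes through the focal point, and that image co-ordinates are made dimensionless by dividing out the plane distance c d. The names `eye_image` and `half_shift` are hypothetical, and the half-shift is taken as half the right-eye value minus the left-eye value, the reading suggested by the pencil example in Part IV.

```python
import math

def eye_image(x_o, d, f, side):
    """Dimensionless image co-ordinates (a, b) of the facial-system point x_o in one eye.

    side = -1 selects the left eye (at -D), +1 the right eye (at +D).
    f is the focal distance; float('inf') means eyes focused on infinity.
    """
    theta = math.atan2(d, f)              # inward turn of each eye; 0.0 when f is infinite
    sn, cs = math.sin(theta), math.cos(theta)
    # Position of the point relative to the eye, still along the facial axes
    # (i up, j right, k away).
    ri, rj, rk = x_o[0], x_o[1] - side * d, x_o[2]
    xj = cs * rj + side * sn * rk         # component along the eye's own j-axis
    xk = cs * rk - side * sn * rj         # component along the eye's own k-axis (toward F)
    return ri / xk, xj / xk               # central projection, scale factor divided out

def half_shift(x_o, d, f):
    """Half the difference between the right-eye and left-eye horizontal co-ordinates."""
    b_left = eye_image(x_o, d, f, -1)[1]
    b_right = eye_image(x_o, d, f, +1)[1]
    return (b_right - b_left) / 2.0
```

Under these assumptions the focal point itself projects to the centre of both images, points beyond it give a positive half-shift, and points nearer than it give a negative one.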

It will be convenient to make several definitions, both
to further non-dimensionalize the mathematics, and to save
writing. We let

and

and note that equations (2-5) now are

and that the equations, after some matrix algebra
simplification, reduce to

Their usefulness now begins to become apparent, for with (2-8),
we can solve for $\epsilon$ in terms of $\alpha$, $\beta$, $\Delta$, and $\phi$:

It would be nice ( as I've indicated by the weird forms
of equations (2-10) ) to simplify these with some
approximations, since in most cases of interest,

$$\phi \gg 1 \,;\quad |\alpha|,\ |\beta| \ll 1 \tag{2-11}$$

We can't do this just yet, though, because of the term $\Delta\phi$,
whose magnitude is unclear. We can get a handle on it, though,
from equation (2-3), where we noted that

[equation illegible in source]

$\xi$ actually may be more useful to us than $\epsilon$ in some
cases, because it represents the actual ( dimensionless )
position of the point $X_o$ in facial co-ordinates. We note that

and after some mathematics, we find that

This quantity, however, can be reduced with (2-11) to

and it is then clear that

( 2-12 )


Several cases are possible and interesting:

A.  $-\epsilon_3 \gg \xi_3$     ( focusing on infinity )

In this case $\Delta\phi$ = 'a large negative number', and

Equation (2-13d) is the only interesting one; it allows us
to find the depth of any point by just focusing on infinity
and measuring $\Delta$.

B.  $|\epsilon_3| \ll \xi_3$     ( any point of nearly equal depth
                                   with the point of focus. )

In this case $\Delta\phi$ is a very small number $\ll 1$, and

Here it is the first three which are of interest. They allow
us simple calculations for the deduction of position relative
to a known focal point, $\phi$.

C.  $\epsilon_3 > \xi_3$     ( see what follows )

In this case, $\Delta\phi > 1$, and a look at equations (2-10)
may cause some mathematicians to worry about small denominators.
Let them rest easily, though. Physically this is impossible,
since

   $\epsilon_3 > \xi_3$   implies   $\xi_3 - \phi > \xi_3$   implies   $\phi < 0$.

Negative focal lengths don't happen very often in practice.

Summarizing the results of this section, then, we have
shown that in facial co-ordinates, whenever

$$\phi \gg 1 \,;\quad |\alpha|,\ |\beta| \ll 1 \,;\quad |\Delta\phi| \ll 1$$

it is approximately true that

[equations illegible in source]

The translation of these results from the facial system to
the table system may be made through the use of

$$E_s = d\,T_{so}\,\epsilon \tag{2-17}$$

or

$$X_s = d\,T_{so}\,\xi + P_s \tag{2-18}$$

where $T_{so}$ is given by equation (2-2).



III. Search Limitation in the Pattern Matching Problem


I really have no right to jump into this aspect of the
problem too far, because I don't know enough of the hardware
limitations and capabilities, but if my first assumption
holds, then what follows should not be too far off the track,
and since it's important, I should say something about it.

If light-intensity measurements are indeed representable
in a 2-D matrix for each eye, then informationally these
matrices, ML and MR, will look as is shown in Fig. III.

The matching by local evidence of elements within these
matrices involves, I will assume, something procedurally
akin to:

   1. Plucking a local region out of one matrix.

   2. Choosing an untested region of the other matrix.

   3. Overlaying the local region on top of it.

   4. Evaluating their local differences.

   5. Iterating steps 2-4 until some cutoff occurs.

   6. Choosing the match with the smallest differences.

I can't really be less vague here without knowing more about
the hardware and noise aspects of the problem, but it's easy
to imagine something like a "minimize-the-sum-of-the-squares-
of-the-differences-over-several-elements" approach, which
would require that for some local region of N x N ( N odd )
elements,

be minimized by adjusting the k and l. In the minimum case,
k and l would then represent half the quantized ( integer )
shifts which existed in the 'combined' image matrix, M, at
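The six steps and the sum-of-squares evaluator can be sketched as follows. This is a minimal illustration under stated assumptions, not the evaluator of the text: the function names are hypothetical, the matrices are plain lists of rows, only the horizontal half-shift l is searched ( k is held at zero ), and the sign convention ( the ML window moved by +l columns, the MR window by -l ) is an arbitrary choice made for the example.

```python
def window_ssd(ml, mr, i, j, l, half):
    """Sum of squared differences between the (2*half+1)-square window of ML
    centred at (i, j + l) and the window of MR centred at (i, j - l)."""
    total = 0.0
    for di in range(-half, half + 1):
        for dj in range(-half, half + 1):
            diff = ml[i + di][j + l + dj] - mr[i + di][j - l + dj]
            total += diff * diff
    return total

def best_half_shift(ml, mr, i, j, n=3, max_l=4):
    """Return (l, score) minimising the windowed SSD over |l| <= max_l.

    n is the (odd) window size; None is returned if no window fits.
    The cutoff here is simply exhaustion of the candidate range.
    """
    half = n // 2
    rows, cols = len(ml), len(ml[0])
    if i - half < 0 or i + half >= rows:
        return None
    best = None
    for l in range(-max_l, max_l + 1):
        if j - abs(l) - half < 0 or j + abs(l) + half >= cols:
            continue                      # a window would leave the matrix
        score = window_ssd(ml, mr, i, j, l, half)
        if best is None or score < best[1]:
            best = (l, score)
    return best
```

On a textured pair of matrices that differ by a pure horizontal displacement, the minimum falls at half that displacement with a score of zero; on a uniform surface every candidate scores the same and the minimum carries no information.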



Regardless of the form taken by the 'evaluator' of step 4,
however, other more basic questions remain:

   1. What is an acceptable cut-off criterion?

   2. In what order do we vary k and l?

   3. How good is the answer we get?

Without getting too involved, it's easy to make some relevant
observations using the results of Part II.

Equations (2-10), for instance, make use of the quantity

[expression illegible in source]

but do not make any reference to the analogous quantity

This quantity may be shown, however, to be redundant and,
more importantly, quite small. From (2-9) and (2-11),

If we wish to let $k < 1/2$, this becomes

Equation (3-2) defines a region ( see Fig. IV ) in the image
matrix, centered on the projection of the focal point at (0,0)
and strictly bounded except along the axes. If we stay within
this region, we will be assured that shifts in the $\alpha$ direction
will be below the quantization level, and hence in our search
we may neglect all k except the trivial case of k = 0.

This limits our search within this region to one dimension
only, along $\beta$, and states that if ever we wish to
find $\Delta$ outside the region defined by (3-2)

$$|\alpha\beta| < \tfrac{1}{2}\,\delta\,\phi$$

we must either go to a more complicated search or else
shift the point of focus. The tradeoffs based on speed
would seem easy to develop.

We can also note that the approximate value of $\Delta$
( approximate value of the shift ) should be predictable if
nearby shifts are already known*, and if the region of
3-space corresponding to the local regions being compared
contains no step discontinuities in $\Delta$ ( depth ).

$\Delta$ should vary in a continuous and piecewise smooth manner
along $\beta$ ( except where 3-D object boundaries exist ) ( see
Part IV ). Remaining "on" an object, then, we would expect
that l would be almost or exactly the same for adjacent
points in the matrix, and we thus not only find a natural
heuristic to help speed up search in these cases, but we
have an immediate flag which signals object boundaries.
Whether or not a high level of noise would trigger this
flag too often is something I haven't been able to
realistically figure out.


* This will always be true if we seek $\Delta_{ij}$ in the following
order: Start next to the focal point ( where $\Delta = 0$ )
and progress spirally outward. "Old" points will always be
adjacent in the direction of the origin and "behind".



As far as cut-off is concerned, it is sort of hard to
think of a heuristic that works equally well when the
predictions are "good", and when they are not.

We might test predictions by looking exhaustively at
some small number of points near the predicted one, and
if the differences seem to be increasing as we move
away in either direction, the prediction is probably "good",
and we can use the best match found from this small set of
data. If, on the other hand, the "match" doesn't seem
particularly good anywhere along this line, then it is
likely that we have crossed a discontinuity-of-depth in
the scene viewed, and we have to do something strange.

Perhaps it would be best to keep looking over greater and
greater areas for a match, but perhaps not, for it is quite
possible that a match cannot be found; one must remember,
after all, that near regions of depth-discontinuity, one
eye sees things that are hidden to the other eye.

What I would then propose for a $\Delta$-finding algorithm
would look, in a more refined form, something like:

   1. Pick a new point of focus.

   2. Plan an outward-spiralling path of examination
      beginning at the origin and remaining within the
      region given by (3-2).

   3. Pick the next point on the path adjacent to a
      "good" point.

   4. Predict 'l' at this point based on nearby values.

   5. Evaluate a small set of "overlays" shifted by about 2l.

   6. If a definite best fit exists near the center of
      this line it is probably "good". Record the shift,
      go back to step 3 again.

   7. The shift is not good. Record this fact, and go
      back to step 3 anyway.



As soon as no new points can be found at step 3, a region
will have been mapped out, and we can either stop or go
back to step 1, depending on the information we need. If
we go back, we record what we have been able to detect
before moving on.*
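Steps 2 through 7 of the procedure above, for a single point of focus, might be sketched like this. It is a simplified illustration under several assumptions that are not in the text: the "spiral" is walked as concentric square rings, the prediction is the rounded mean of the shifts already found at the eight neighbouring points, "a definite best fit near the center" is taken to mean that the best candidate is not at either end of the tested range, the evaluator is a windowed sum of squared differences, and a point with no good neighbour at the moment it is visited is simply skipped. All function and parameter names are hypothetical.

```python
import math

def spiral_path(radius):
    """Yield (row, col) offsets from the origin, ring by ring outward."""
    yield (0, 0)
    for r in range(1, radius + 1):
        row, col = -r, -r
        for drow, dcol in ((0, 1), (1, 0), (0, -1), (-1, 0)):
            for _ in range(2 * r):
                yield (row, col)
                row, col = row + drow, col + dcol

def window_ssd(ml, mr, i, j, l, half):
    """Windowed sum of squared differences for half-shift l;
    math.inf if either window would leave the matrix."""
    rows, cols = len(ml), len(ml[0])
    if i - half < 0 or i + half >= rows:
        return math.inf
    if j - abs(l) - half < 0 or j + abs(l) + half >= cols:
        return math.inf
    total = 0.0
    for di in range(-half, half + 1):
        for dj in range(-half, half + 1):
            diff = ml[i + di][j + l + dj] - mr[i + di][j - l + dj]
            total += diff * diff
    return total

def find_shifts(ml, mr, centre, radius, half=1, spread=2):
    """Map half-shifts around `centre`, the projection of the focal point.

    Returns (good, bad): a dict of offsets to accepted half-shifts, and the
    set of offsets where no convincing match was found.
    """
    ci, cj = centre
    good = {(0, 0): 0}          # the focal point has zero shift by construction
    bad = set()
    for di, dj in spiral_path(radius):                        # step 2
        if (di, dj) == (0, 0):
            continue
        near = [good[(di + a, dj + b)]
                for a in (-1, 0, 1) for b in (-1, 0, 1)
                if (di + a, dj + b) in good]
        if not near:                                          # step 3: no good neighbour
            continue
        guess = round(sum(near) / len(near))                  # step 4
        cands = range(guess - spread, guess + spread + 1)     # step 5
        score, l = min((window_ssd(ml, mr, ci + di, cj + dj, c, half), c)
                       for c in cands)
        if score < math.inf and cands[0] < l < cands[-1]:     # step 6
            good[(di, dj)] = l
        else:                                                 # step 7
            bad.add((di, dj))
    return good, bad
```

The set of "bad" offsets plays the role of the object-boundary flag: a cluster of them suggests a depth discontinuity, or a region seen by only one eye.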

I have one final comment to make on "noise": in
line-drawing type programs, noise means extraneous variations in
light intensity relative to 'average' over planar surfaces,
and so the easiest objects to work with are smooth and uniformly
colored. In the kind of pattern recognition program I've
mentioned here, 'smooth and uniform' blocks are obviously
terrible to work with, for locally the only variations in
intensity are due to the inverse-square losses in light from
a point source. What we really want for a local pattern-matcher
is objects with lots of local detail ( a light spray painting
might be good ). The kind of noise we can't tolerate is
variations between the eyes, when they look at the same small
region of space. Any 'noise' picked up consistently by both
eyes will only make the pattern matching ( and depth-perception )
more efficient.


* It is interesting to compare this algorithm with the much
more complete work of Lerman (1). Although derived independently,
and applicable to different eye configurations, both attempt
to deal with similar effects, and the reader is encouraged
to regard them as complementary. Basically, Lerman obtains
a set of possible 'matches' by comparing intensity differences
between the shifted image elements with a fixed cut-off,
and then refines this 'possible set' to remove ambiguities
and eliminate spurious points. His results are conceptually
encouraging, but when applied to actual images take a great deal
of time. If a shifting point of focus and a goal-oriented
measurement scheme were to be incorporated, it is possible
that a more widely applicable set of programs could be
generated.




IV. General Remarks, and Figures


So far this has been pretty mathematical, and it may
be interesting ( and instructive ) to see what these results
look like when applied to human vision. People's eyes are
about 2" apart, and so in what follows, d = 1", and
distances may be interpreted either as dimensionless or in
inches, since the numbers come out the same either way.

( $(\alpha, \beta, \Delta)$ are always dimensionless, however. )

If you hold a pencil up at arm's length ( about 33" away )
and focus on infinity, equation (2-15) predicts that

$$\Delta \approx -.03$$

and since this is one half the difference between your right
eye's image and your left eye's, you should see the "two pencil
tips" shifted apart by about $2\Delta \approx -.06$. ( The minus sign
claims that your right eye is responsible for the left image.
You can check this by blinking. ) This gives you ( or an
uninitiated vision computer ) a handle on the size of the
$\beta$-scale, and I assume that the $\alpha$-scale is the same. ( So
far the restriction to $|\beta| \ll 1$ doesn't seem too limiting. )
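The arithmetic of the pencil example can be reproduced in a few lines. This is a sketch under assumptions rather than the equation cited above: with the eyes parallel, a point straight ahead at a given depth is taken to appear at +d/depth in the left eye ( which sits at -d ) and at -d/depth in the right eye, and the half-shift is half the right-eye value minus the left-eye value. The function name is hypothetical.

```python
def half_shift_at_infinity(depth, d=1.0):
    """Half-shift of a point straight ahead at `depth`, eyes focused on infinity."""
    beta_left = d / depth       # the left eye sits at -d, so the point appears to its right
    beta_right = -d / depth     # the right eye sits at +d, so the point appears to its left
    return (beta_right - beta_left) / 2.0

delta = half_shift_at_infinity(33.0)          # pencil at about 33", d = 1"
print(f"{delta:.2f} {2 * delta:.2f}")         # prints: -0.03 -0.06
```

Read the other way round, the same relation gives the depth as minus the reciprocal of the measured half-shift, which is the sense in which focusing on infinity yields depth directly.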

Now try looking at something nearby and heavily textured,
like a flower or a crumpled piece of paper. When I did this
I found my eyes "jumped around" over the surface, making leaps
of about $|(\alpha,\beta)| \approx .03$ at $\phi \approx 10$. Equation (3-2) then fixes
my approximate* limit of resolution such that

This is about 40 seconds of arc ( the thickness of a piece
of newsprint at 20 feet ), which sounds like the right ballpark
at least. Computer 'eyes' won't be able to keep up with this
kind of accuracy, and so we will have to expect some major
differences in performance from depth-sensitive programs
linked to any realistic hardware.

Pictures are also interesting, and I've included some
in the following pages which I've taken the trouble to draw
fairly accurately. One aside that strikes me as I look at
them is that parallel lines don't come out looking very
parallel, and yet I seem to remember the mention of some
heuristics which made use of parallelism in line drawings.
Perhaps their authors made different assumptions than I have.

These are line-drawing type pictures, even though a
depth-sensitive program would be just as happy with curves, and
wouldn't work at all without local detail in the planes
themselves. I hope this doesn't bother anyone; curves and
surfaces are hard to draw.

* I am assuming here that it is my 'depth-searcher' which is
driving my point of focus around the object. Actually it
might not be uniquely responsible, but on very irregular
objects it seems likely it would be important. Also, I have
assumed that my eyes 'jump' only when they reach the edge of
the region defined by (3-2). If something more conservative
was taking place, the limit of resolution would come out
smaller.

Figure V shows the scene from the top, as a perspectiveless
blueprint included only for clarity. The other figures are
self-explanatory. Things to look for include the sign and
magnitude of $\Delta$ over the image, the basic scale of the two axes
of the image, and the ( very small ) effects of [illegible] in the
prediction of [illegible]. Whenever $\Delta$ is positive, the point
indicated is farther away than is the focal point; when the
reverse is true, it is closer. All 'right eye' images are
shown dashed.


V. Error Analysis: Resolution Capability


Equations (2-10) represent solutions to relative 3-D
displacement in terms of continuous and precise values
of $\alpha$, $\beta$, $\Delta$. Approximations to these solutions
have been given in equations (2-14), and are valid
when the conditions (2-11) are satisfied. It remains
to be seen, however, just how accurate these approximations
are when fed the quantized data, $\alpha$, $\beta$, $\Delta$, by a system
with limits of resolution, $\delta$. This section will examine
such questions, and produce first order error and uncertainty
estimates.

The errors inherent to equations (2-14) may be written
down directly as the difference between the two sets of
equations:

For any given values of [illegible], these may be regarded
as functions of the variables

when [illegible] are small.

We can solve exactly for the partials in (5-3), and
produce the results in terms of measured quantities:

The constant term in (5-3) arises from the use of (2-11)
and the assumption that $|\epsilon_3| \ll \xi_3$. To first order terms, it
may be written as

Combining (5-4), (5-5), and setting

we obtain:

or approximately, with

A still simpler form:

With suitable restrictions on the use of the techniques
of measurement, it would be expected that the first two
terms in each of the above could be held arbitrarily
small. The fundamental limitation on accuracy in position
measurement, however, is fixed by the quantization level,
and is represented by the third terms: approximately,

While this limitation is small in its effect on horizontal
and vertical position measurements, its effect on range
resolution is not, and for $\delta = .001$, the limits of
range resolution near the focal point may be found as a
function of $\phi$ to be
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